Toward Acoustic Models for Languages with Limited Linguistic Resources
Abstract
This paper discusses preliminary results on creating acoustic models for one language from acoustic models that already exist for another. As a case study, we build acoustic models for Mexican Spanish by automatically tagging the training corpus with a recognition system for French. The resulting Mexican Spanish acoustic models give promising results at the phonetic level, reaching a recognition rate of 71.81%.
Similar resources
Pronunciation Lexicon Development for Under-Resourced Languages Using Automatically Derived Subword Units: A Case Study on Scottish Gaelic
Developing a phonetic lexicon for a language requires linguistic knowledge as well as human effort, which may not be available, particularly for under-resourced languages. To avoid the need for the linguistic knowledge, acoustic information can be used to automatically obtain the subword units and the associated pronunciations. Towards that, the present paper investigates the potential of a rec...
Genre analysis of literature research article abstracts: A cross-linguistic, cross-cultural study
Following Swales’s (1981) works on genre analysis, studies on different sections of Research Articles (RAs) in various languages and fields abound; however, only scant attention has been directed toward abstracts written in Persian, and in the field of literature. Moreover, claims made by Lores (2004) regarding the correspondence of two types of abstracts with different ...
Language independent and unsupervised acoustic models for speech recognition and keyword spotting
Developing high-performance speech processing systems for low-resource languages is very challenging. One approach to address the lack of resources is to make use of data from multiple languages. A popular direction in recent years is to train a multi-language bottleneck DNN. Language dependent and/or multi-language (all training languages) Tandem acoustic models are then trained. This work con...
An automated linguistic knowledge-based cross-language transfer method for building acoustic models for a language without native training data
In this paper we describe an automated, linguistic knowledge-based method for building acoustic models for a target language for which there is no native training data. The method assumes availability of well-trained acoustic models for a number of existing source languages. It employs statistically derived phonetic and phonological distance metrics, particularly a combined phonetic-phonological...
Joint multilingual learning for coreference resolution
Natural language is a pervasive human skill not yet fully achievable by automated computing systems. The main challenge is understanding how to computationally model both the depth and the breadth of natural languages. In this thesis, I present two probabilistic models that systematically model both the depth and the breadth of natural languages for two different linguistic tasks: syntactic par...